Unbiased Global Optimization of Lennard-Jones Clusters for < 201 by 
Conformational Space Annealing Method 
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We apply the conformational space annealing (CSA) method to the Lennard-Jones clusters and 
find all known lowest energy configurations up to 201 atoms, without using extra information of the 
problem such as the structures of the known global energy minima. In addition, the robustness of 
the algorithm with respect to the randomness of initial conditions of the problem is demonstrated 
by ten successful independent runs up to 183 atoms. Our results indicate that the CSA method is 
a general and yet efficient global optimization algorithm applicable to many systems. 
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Finding the global minimum (GM) of a given function, 
called the global optimization, is an important problem 
in various fields of science and engineering. One of the 
simplest algorithms for global optimization is the simu- 
lated annealing (SA) method J,], which has been applied 
to many systems. Although SA is very versatile in that 
it can be applied to many problems, the drawback is that 
its efficiency is usually much lower than problem specific 
algorithms. This is especially problematic for NP-hard 
problems such as protein folding or molecular cluster op- 
timizations. For this reason, it is important to find an 
algorithm which is as general as SA, and yet competitive 
with problem specific ones. 

Recently, a powerful global optimization method called 
conformational space annealing (CSA) was proposed 0, 
and applied extensively and exclusively to the protein 
folding problem |3, iJ, 0, H, 011, |3| ■ The benchmark tests 

IE yl have demonstrated that it can not only find the 
known GM conformations with less computations than 
existing algorithms, but also provide new GMs in some 
cases 

The CSA unifies the essential ingredients of three 
global optimization methods, Monte Carlo with mini- 
mization (MCM) genetic algorithm (GA) [13, and 
SA. First, as in MCM, we consider only the phase space of 
local minima, i.e. all configurations are energy-minimized 
by a local minimizer. Secondly, as in GA, we consider 
many configurations (called bank in CSA) collectively, 
and we perturb a subset of bank configurations (seeds) 
using other bank configurations. This procedure is simi- 
lar to mating in GA. However, in contrast to the typical 
mating procedure in GA, we often replace small portions 
of a seed with the corresponding parts of bank configu- 
rations in order to search the neighborhood of the seed 
configuration, as will be elaborated later. Finally, as in 
SA, we introduce a parameter I?cut, which plays the role 



of the temperature in SA. The diversity of sampling is di- 
rectly controlled in CSA by introducing a distance mea- 
sure between two configurations and comparing it with 
Dent, whereas in SA there are no such systematic con- 
trols. The value of Dcut is slowly reduced just as in SA, 
hence the name conformational space annealing. Main- 
taining the diversity of the population using a distance 
measure was also tried in the context of GA 12] , although 
no annealing was performed. 

It should be noted that although the CSA method has 
been applied, so far, only to the polypeptide systems, the 
structure of the algorithm is not specific to these systems. 
In fact, to apply the CSA to an optimization problem, 
only two things are necessary; a method for perturbing a 
seed configuration, and a distance measure between two 
configurations. This suggests that the CSA is a candi- 
date for a versatile and yet powerful global optimization 
method. In this Letter, we demonstrate it by applying 
the CSA to Lennard-Jones (LJ) clusters. 

The LJ cluster is the system consisting of identical 
atoms interacting by a pairwise LJ potential: 
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where N is the number of atoms and the distance 
between atoms i and j. We use the reduced unit where 
e = CT = 1. It is not only interesting as a model for 
heavy inert gases, but also serves as a popular benchmark 
system for optimization algorithms. In fact, despite the 
simple form of the interaction, finding the GM configu- 
ration has been a challenging pro blem even for small N 
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Many of the powerful global optimization algorithms 
applied to this system are specific to the LJ cluster. In 
particular, they use the information on the structure of 
the known GMs of LJ clusters, favoring closely packed 
ones. For example, many GMs for N < 147 were discov- 
ered for the first time in Ref. 0| by using icosahedrally 
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derived lattices. A similar strategy [3J| was utilized for 
N < 309. Although these approaches are powerful, they 
fail when the GM configuration is of an unexpected struc- 
ture. Also, since they are designed specifically for the LJ 
cluster problem, it is not clear if they can be applied to 
other systems. 

Recently there were a number of successful applica- 
tions of unbiased search methods to LJ clusters. One 
of the most powerful methods in this category is the 
basin-hopping method [isl Isij . Almost all GM known 
at present [33 were reproduced for N < 110, and those 
for N = 69, 78, 107 were updated. This is an impressive 
result indeed. However, there are certain magic num- 
bers (TV = 76, 77, 103, 104) where the GMs could not be 
found by directly optimizing the system consisting of N 
atoms. They could be found only by adding and sub- 
tracting an atom from the GM configurations of cluster 
size ± 1. In Ref. [13 1 a variant of the basin- hopping 
method was applied for A'^ < 110, which performed bet- 
ter than the original version, but the known lowest en- 
ergy minima for A^ = 75 — 77 could be found only 4, 2, 8 
times out of 1000 independent runs. Similarly, the un- 
biased search in Ref. [22| could not reproduce the GMs 
for N = 75 - 77, 98. In Ref. [13, all GMs for N < 150 
were reproduced with a unbiased search method. The 
same method was also applied for larger cluster sizes, 
and found some of the known lowest energy minima for 
A^ < 309 [13, Hi- A global optimization method which 
combines the idea of basin-hopping and the genetic algo- 
rithm was applied for the cluster size A^ < 309 but failed 
to reproduce the known minima [s^ for A^ = 185, 187. 

In this work, we apply the CSA to the LJ cluster prob- 
lem. We find all known GM configurations for A^ < 201 
[13 ■ In particular, for each LJ cluster for A^ < 183, 
we generate ten independent random configurations and 
succeed in finding the GMs for all cases without an ex- 
ception. This is an exhaustive test unprecedented in the 
literature, and shows that our algorithm is quite robust 
with respect to the change of initial conditions. 

To elaborate on the details of the CSA method, we 
first randomly generate a certain number of initial config- 
urations (50 in this work) whose energy is subsequently 
minimized. We call the set of these configurations the 
first bank. We make a copy of the first bank and call it 
the bank. The configurations in the bank are updated in 
later stages, whereas those in the first bank are kept un- 
changed. Also, the number of configurations in the bank 
is kept unchanged when the bank is updated. The initial 
value of £>cut is set as Dave/2 where Dave is the aver- 
age distance between the configurations in the first bank. 
New configurations are generated by choosing a certain 
number (20 in this work) of seed configurations from the 
bank and by replacing parts of the seeds by the corre- 
sponding parts of configurations randomly chosen from 
either the first bank or the bank. Random perturbations 
are also performed. In this work 20 and 10 configurations 



are generated for each seed using the partial replacements 
and random perturbations, respectively. Then the ener- 
gies of these configurations are subsequently minimized 
(trial configurations). 

A newly obtained local minimum configuration a is 
compared with those in the bank to decide how the bank 
should be updated. One first finds the configuration A in 
the bank which is closest to a with the distance D{a, A). 
If D{a, A) < Dcut, OL is considered as similar to A. In this 
case, the configuration with lower energy among a and A 
is kept in the bank, and the other one is discarded. How- 
ever, if D(a, A) > Dcut, a is regarded as distinct from all 
configurations in the bank. In this case, the configuration 
with the highest energy among the bank configurations 
plus a is discarded, and the rest are kept in the bank. 
We perform this operation for all trial configurations. 

This process of generating new conformations by per- 
turbation and subsequent local minimizations, and up- 
dating the bank, can be visualized follows (Fig.l). Each 
of the bank configurations can be considered to represent 
all local minima contained in the sphere with radius Dcut, 
centered on it. To improve a bank configuration A, we 
first select A as a seed. We perturb A and subsequently 
energy-minimize it to generate a trial configuration a. 
When a originates from A by small perturbation, it is 
likely that a is contained in a sphere centered on A. If 
a replaces A, the center of the sphere moves from A to 
a. If Q belongs to a different sphere centered on D, a 
can replace D in a similar manner. When a is outside 
of all existing spheres, a new sphere centered on a is 
generated. In this case, to keep the number of spheres 
fixed, we remove the sphere represented by the highest- 
energy configuration. Obviously, the former two cases 
are more likely to happen when the spheres are large, and 
the latter when spheres are small. Larger value of Dcut 
produces more diverse sampling, whereas smaller value 
results in quicker search of low-energy configurations at 
the expense of getting trapped in a basin probably far 
away from the GM. 

Therefore, for efficient sampling of the phase space, it 
is necessary to maintain the diversity of sampling in the 
early stages and then gradually shift the emphasis to- 
ward obtaining low energy configurations, by slowly re- 
ducing Dcut- In practice, the Dcut is reduced by a fixed 
ratio after the bank update has been attempted by all 
the newly generated trial configurations, in such a way 
that Dcut reaches Dave/5 after 10000 local minimizations. 
Then seeds are selected again from the bank configura- 
tions which have not been used as seeds yet, to repeat 
aforementioned procedure. The value of Dcut is kept con- 
stant after it reaches the final value. 

When the energy of a seed configuration does not im- 
prove after a fixed number of perturbations, we stop per- 
turbing it. To validate this judgment, it is important 
that typical perturbations are kept small, so that the 
perturbed configurations are close to their original seeds. 
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FIG. 1: Schematic figure showing the search procedure of 
CSA. The boxes represent the identical phase space, (a) Ini- 
tially, we cover the phase space by large spheres centered on 
randomly chosen local minima denoted by x , and replace the 
centers with lower-energy local minima. When A is replaced 
by a, the sphere moves in the direction of the arrow, (b) 
As the algorithm proceeds and the energies of the representa- 
tive configurations at the centers of the spheres are lowered, 
the size of the spheres are reduced and the search space is 
narrowed down to small basins of low-lying local minima. 

Hovifever, large perturbations are also performed, in order 
to efficiently sample various regions of the search space. 

It should be noted that in the early stages of CSA the 
seed configurations are continuously being replaced by 
low energy local minima close to it. Therefore, when all 
of the bank configurations are used as seeds (one iteration 
completed) , usually after tens of thousands of local mini- 
mizations, this implies that the procedure of updating the 
bank might have reached a deadlock. However, we give 
these configurations another chance by resetting them 
to be eligible for seeds, and repeat another iteration of 
search. After a preset number of iterations, we conclude 
that our procedure has reached a deadlock. When this 
happens, we enlarge the search space by adding more ran- 
dom configurations into the bank and repeat the whole 
procedure until the stopping criterion is met. In this 
work, after 3 iterations are completed, we increase the 
number of bank configurations by adding 50 randomly 
generated and minimized configurations into the bank 
(and also into the first bank), and reset Prut to -Davc/2. 
The algorithm stops when the known GM 'SS'l is found, 
which is examined after all the new trial configurations 
are used for possible bank updates. 

It should be noted that since one iteration is completed 
only after all bank configurations have been used as seeds, 
and we add random configurations whenever our search 
has reached a deadlock, there is no loss of generality for 
using particular values for the number of seeds, the num- 
ber of bank configurations, etc. 

We define the distance measure D{k,k') as follows. 
Given a configuration fc, we define the first and second 
shell centered on each atom as the spheres of radii 1.35 
and 1.70. These values of the radii are obtained from 
the radial distribution function, which is approximately 
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FIG. 2: CPU time (sec) of ten separate runs for cluster size 
< 201. The square boxes denote the average values and 
the error bars indicate the ranges of the values. The partially 
successful runs (A = 184, 188 - 192, 198, 199) are excluded 
(See text). The circles denote the result for N < 150 reported 
in Ref. T^l after rescaling of the CPU clocks. 

estimated from randomly generated local minimum con- 
figurations with N — 201, and do not depend much on 
N. We then obtain the histogram iJ'=(l,n) {H^l2,n)) 
which is the number of atoms having n atoms in the 
first (second) shell. The first and second coordination 
numbers contain the geometrical information of the lo- 
cal environment of each atom. Therefore, the histograms 
provide collective information of such local structures in 
a cluster. We put larger weight for the first coordination 
number than the second one since the former is more im- 
portant than the latter. In addition, larger weights are 
assigned to the histograms with larger n, since they cor- 
respond to the core part of a cluster. This motivates us 
to define D{k, k') as: 

D{k,k') = ^n(2|i/'=(l,n)-i7''"(l,n)| 

n 

+ \HH2,n)-H'^'{2,n)\y (2) 

It should be noted that the details of the definition of 
D{k,k') do not affect the overall performance. 

The method of perturbing a seed configuration s to 
generate a new configuration is as follows 25]. We first 
generate random planes passing through the center of 
mass of the seed s and another bank configuration k. We 
then choose from s a certain fraction (25 — 50%) of atoms 
which are farthest from the plane, and replace them by 
the corresponding counterpart in k. We also make a ran- 
dom rotation perpendicular to the face of the cut surface 
when we put together the two parts. The main difference 
from the method used in Ref. |25| is that we are making a 
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fractional replacement of a seed, for the reason discussed 
earlier. In addition, we also generate new configurations 
by choosing an atom in s with the lowest coordination 
number, and move it to the neighborhood of an atom 
with the second lowest coordination number. 

We could reproduce all published GM configurations 
[sif for < 201. To measure the overall efficiency of 
our algorithm, we performed exhaustive systematic runs. 
We performed ten independent CSA runs for each clus- 
ter size < 183, and found all known GMs for all ten 
cases, without an exception, which demonstrates the ro- 
bustness of the algorithm with respect to the randomness 
of the initial conditions. The same test was performed 
for 184 < TV < 201, but for = 184, 188 - 192, 198, 199 
the optimization was only partially successful (4-9 out of 
10). The average values and fluctuations of CPU times 
for these runs are shown in Fig. |21 The computations 
were carried out on Athlon processors (1.667 GHz), and 
the limited-memory quasi-Newton method |39l| was used 
for local minimization. The minimization stopped when- 
ever \G\/V3N < 0.001, where G is the gradient. A sim- 
ilar plot was presented in Ref. for N < 150, where 
the identical local minimizer was used. These results are 
also included in Fig. [5] for comparison, after the rescaling 
of the CPU clocks. The result suggests that our algo- 
rithm is faster on average for the sizes where the data is 
available, although a rigorous conclusion is difficult to be 
drawn due to the possible technical differences such as 
the stopping criteria of the local minimizer, and the fact 
that our result is the average of ten independent runs. 

The CSA method is now also being applied to quite 
different kinds of problems such as the traveling salesman 
problems, spin glasses, not to mention complex molecular 
clusters. For example, the shortest path of ATT532 ^3 
was found for all 100 independent runs, whose results 
will be reported elsewhere. This suggests that the CSA 
method is a general and yet efficient global optimization 
algorithm applicable to many systems. As is the case 
for GA, the CSA can also be easily adapted for efficient 
parallel computation0,0|. 
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